Improved TDNNs using Deep Kernels and Frequency Dependent Grid-RNNs
Abstract
Time delay neural networks (TDNNs) are an effective acoustic model for large vocabulary speech recognition. The strength of the model can be attributed to its ability to model long temporal contexts effectively. However, current TDNN models are relatively shallow, which limits their modelling capability. This paper proposes a method of increasing the network depth by deepening the kernel used in the TDNN temporal convolutions. The best performing kernel consists of three fully connected layers with a residual (ResNet) connection from the output of the first layer to the output of the third. The paper also investigates adding spectro-temporal processing at the input to the TDNN, in the form of either a convolutional neural network (CNN) or a newly designed Grid-RNN. The Grid-RNN strongly outperforms the CNN if different sets of parameters are used for different frequency bands, and it can be further enhanced by making it bi-directional. Experiments using the multi-genre broadcast (MGB3) English data (275h) show that deep kernel TDNNs reduce the word error rate (WER) by 6% relative and, when combined with the frequency-dependent Grid-RNN, give a relative WER reduction of 9%.
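The deep kernel described in the abstract can be illustrated with a minimal sketch. The abstract only states that the kernel has three fully connected layers with a residual connection from the output of the first to the output of the third; everything else here is an assumption made for illustration: the ReLU nonlinearities and their placement, the layer sizes, the context offsets (-1, 0, 1), the random initialisation, and all function names (`deep_kernel`, `tdnn_layer`, `init_layer`). It is not the authors' implementation.

```python
import random

def relu(v):
    return [max(0.0, x) for x in v]

def affine(W, b, v):
    # W is a list of rows; computes W @ v + b
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]

def init_layer(n_out, n_in, rng):
    # Hypothetical initialisation: small Gaussian weights, zero biases
    W = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return W, [0.0] * n_out

def deep_kernel(params, spliced):
    # Three fully connected layers applied to one spliced context window
    (W1, b1), (W2, b2), (W3, b3) = params
    h1 = relu(affine(W1, b1, spliced))
    h2 = relu(affine(W2, b2, h1))
    h3 = affine(W3, b3, h2)
    # Residual connection: output of layer 1 added to output of layer 3
    # (applying ReLU after the sum is an assumption)
    return relu([a + b for a, b in zip(h3, h1)])

def tdnn_layer(params, frames, offsets):
    # Slide the kernel over time; each step splices frames at the given offsets
    start = max(0, -min(offsets))
    stop = len(frames) - max(0, max(offsets))
    outputs = []
    for t in range(start, stop):
        spliced = [x for k in offsets for x in frames[t + k]]
        outputs.append(deep_kernel(params, spliced))
    return outputs

rng = random.Random(0)
feat_dim, hidden, offsets = 4, 8, (-1, 0, 1)
params = [
    init_layer(hidden, feat_dim * len(offsets), rng),
    init_layer(hidden, hidden, rng),
    init_layer(hidden, hidden, rng),
]
frames = [[rng.gauss(0.0, 1.0) for _ in range(feat_dim)] for _ in range(10)]
out = tdnn_layer(params, frames, offsets)
print(len(out), len(out[0]))  # 8 8
```

In this sketch a standard TDNN layer would correspond to using only the first affine layer and its nonlinearity; the deep kernel replaces that single transform with the three-layer residual block while keeping the same temporal splicing.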
Related resources
Experimental Comparison of Fuzzy and Neural Network Techniques in Learning Models of the Central Nervous System Control
The aim of this work was to apply the fuzzy inductive reasoning (FIR) methodology and both time-delay and recurrent neural networks (TDNNs and RNNs) to induce models of the central nervous system (CNS) control that accurately represent the input/output behavior available from observations of a particular patient. A comparative study of these approaches from the point of view of the predictiven...
Hierarchical Temporal Representation in Linear Reservoir Computing
Recently, studies on deep Reservoir Computing (RC) highlighted the role of layering in deep recurrent neural networks (RNNs). In this paper, the use of linear recurrent units allows us to bring more evidence on the intrinsic hierarchical temporal representation in deep RNNs through frequency analysis applied to the state signals. The potentiality of our approach is assessed on the class of Mult...
Adaptive Setting of UFLS Relay Using Hourly Programming with Consideration of Renewable Energy Sources in Smart Grid
In the light of the emergence of smart grids, the functions associated with this type of grids in the blocks of the energy management system require the adoption of robust strategies in order to provide a higher level of control and protection. Under-frequency load shedding (UFLS) sheds load blocks when the frequency drop is below the threshold limit. In adaptive UFLS, in an advanced telecommun...
Regionally optimised kernels for time-frequency distributions
Ideally, kernels used to generate bilinear time-frequency distributions (TFD) should be signal-dependent, and optimised independently at every location in the time-frequency (TF) plane. This poses an extremely severe computational burden. A compromise is proposed in this paper: time-varying kernels are optimised for specific regions in the time-frequency plane. The regions, designed to isolate ...
Deep Collective Inference
Collective inference is widely used to improve classification in network datasets. However, despite recent advances in deep learning and the successes of recurrent neural networks (RNNs), researchers have only just recently begun to study how to apply RNNs to heterogeneous graph and network datasets. There has been recent work on using RNNs for unsupervised learning in networks (e.g., graph clu...
Journal:
- CoRR
Volume: abs/1802.06412
Publication date: 2018